add README for AMD ROCm#32
Conversation
This comment was marked as outdated.
This comment was marked as outdated.
|
Thanks for this. Ive just merged a PR to master that is going to conflict. I am now just getting around the AMD project personally. Rather than send you back to square one though with those merge conflicts, feel free to leave this a few days as I will still analyze the approach relative to your merge base. I have a few ideas on how to make this easier esp from a builds point of view. Im going to next couple of days catch up on the history and approach and see where we are at. This is Aimdos next official feature by plans as of this writing. |
|
I gave this a quick test on Linux and it seems to work fine (I'm not sure what made the detour unnecessary on Linux but apparently there is no infinite loop anymore). It'll need the mmap workaround from my PR to avoid the memory leak issue in the runtime. It was fixed upstream but the fix isn't in any release yet as far as I am aware |
|
@rattus128 I'm so glad to hear that. Thank you for your amazing work, aimdo has really helped me solve a lot of problems. |
|
@asagi4 Detour is Windows only hooking tool I think, should be unrelated to Linux. And Linux hooks were just added in version hours ago. So disappointing about memory leak issue fix not released, as disappointed that AMD ignoring some other pytorch issues. |
This comment was marked as outdated.
This comment was marked as outdated.
|
|
||
| def __del__(self): | ||
| if control.lib is not None and hasattr(self, '_ptr') and self._ptr: | ||
| if lib is not None and hasattr(self, '_ptr') and self._ptr: |
There was a problem hiding this comment.
Running example.py and got
Iteration 5
...
Exception ignored in: <function ModelVBAR.__del__ at 0x000001E631BC2D40>
Traceback (most recent call last):
File "H:\ROCm\.venv\Lib\site-packages\comfy_aimdo\model_vbar.py", line 122, in __del__
AttributeError: 'NoneType' object has no attribute 'lib'
Exception ignored in: <function ModelVBAR.__del__ at 0x000001E631BC2D40>
Traceback (most recent call last):
File "H:\ROCm\.venv\Lib\site-packages\comfy_aimdo\model_vbar.py", line 122, in __del__
AttributeError: 'NoneType' object has no attribute 'lib'
so I changed control.lib to lib, but control.lib should also be None, right?
Catch me up, are we leaking memory here in Aimdo or is this pytorch AMD side? I dont have an aimdo mem leak on the radar and would be happy to fix if thats proven. |
the HIP runtime has (had) a bug that prevents memory from being freed properly if there's still any virtual memory range mapped that ever had the memory mapped it it. It's fixed now, but until it's part of an actual release, you can easily work around it with a manual mmap call. See the latest commit in my PR for the workaround. and |
|
I merged, resolved and did some further dev on the @asagi4 PR as #35 . You have some useful README stuff here that we need to figure out. The 7.2.1 from portable README doesn't work for me yet. Im going to give it a bit to see if we get an official update from AMD, but for the moment what you have written in README kinda stands as the best advice. |
|
@rattus128 Thank you for your amazing work! ROCm 7.2.2 was released last week but not including Windows version XD, and there must be a long wait till next release. |
|
I updated code and let's discuss about It's reduced from 16MB to 2MB in 9f2d2fa because of Linux OOM in early time, but as @asagi4 tested this PR's build (which has always been using 16MB) #32 (comment), I don't think |
| dist/ | ||
| *.egg-info/ | ||
| .venv/ | ||
| comfy_aimdo/__pycache__/ |
There was a problem hiding this comment.
For pip editable installation.
|
I think the OOM with the chunk size might've been caused by the same VRAM leak bug. Using a chunk size of 16MB seems to work now on AMD too, though I can't tell if it helps or hurts performance. |
Thanks for confirming! It looks like the OOM was indeed caused by the HIP VRAM leak rather than the chunk size itself. With the unmap workaround in place addressing the root cause, 16MB should be fine on AMD. (With Linux Regarding performance — 16MB should actually be faster than 2MB. Each chunk in As I had made a separate PR for Windows' chunk size #40, feel free to share your opinions. |
DEPRECATED
Motivation
I noticed that 8 out of 16 forks of this repository attempt to add AMD ROCm support. After analyzing their changes, I confirmed that porting aimdo to ROCm primarily requires mapping CUDA driver APIs to their HIP equivalents. However, none of the existing forks support both Linux and Windows simultaneously.
This PR consolidates those community efforts and provides a unified, cross-platform implementation with full CI support for both Linux and Windows ROCm builds.
Changes
src/plat_hip.haimdo-rocm.so(dll)aimdo.so(dll)toaimdo-cuda.so(dll)for distinctiontorch.version.hipbuild-linux-docker-rocmandbuild-win-rocm.ps1for devsexamples/example.pyImpact on ComfyUI
None, I think. This change is fully backward compatible:
comfy_aimdo.control.init()will load the ROCm backend library, but aimdo will not be actively used unless explicitly enabled via--enable-dynamic-vramAdditional requirements
ROCm build on Windows require
clang.exein SDK so I followed your approach made this https://github.com/Apophis3158/comfy-aimdo/releases/tag/v0.0.0You can copy the necessary components from the ROCm SDK rocm_sdk_core-7.2.1-py3-none-win_amd64.whl yourself:
Experimental AMD ROCm support and CI has been achieved by a linkless way, but there are still some things left in the original PR to discuss.